GrAF: A Graph-based Format for Linguistic Annotations
نویسندگان
چکیده
In this paper we describe the Graph Annotation Format (GrAF) and show how it is used represent not only independent linguistic annotations, but also sets of merged annotations as a single graph. To demonstrate this, we have automatically transduced several different annotations of the Wall Street Journal corpus into GrAF and show how the annotations can then be merged, analyzed, and visualized using standard graph algorithms and tools. We also discuss how, as a standard graph representation, it allows for the application of well-established graph traversal and analysis algorithms to produce information about interactions and commonalities among merged annotations. GrAF is an extension of the Linguistic Annotation Framework (LAF) (Ide and Romary, 2004, 2006) developed within ISO TC37 SC4 and as such, implements state-of-the-art best practice guidelines for representing linguistic annotations.
منابع مشابه
GrAF: A Graph-based Format for Linguistic Annotations
In this paper we describe the Graph Annotation Format (GrAF) and show how it is used represent not only independent linguistic annotations, but also sets of merged annotations as a single graph. To demonstrate this, we have automatically transduced several different annotations of the Wall Street Journal corpus into GrAF and show how the annotations can then be merged, analyzed, and visualized ...
متن کاملA C L 2 0 0 7 The LAW Proceedings of The Linguistic Annotation Workshop
In this paper we describe the Graph Annotation Format (GrAF) and show how it is used represent not only independent linguistic annotations, but also sets of merged annotations as a single graph. To demonstrate this, we have automatically transduced several different annotations of the Wall Street Journal corpus into GrAF and show how the annotations can then be merged, analyzed, and visualized ...
متن کاملThe Linguistic Annotation Framework: a standard for annotation interchange and merging
This paper overviews the International Standards Organization Linguistic Annotation Framework (ISO LAF) developed in ISO TC37 SC4. We describe the XML serialization of ISO LAF, the Graph Annotation Format (GrAF) and discuss the rationale behind the various decisions that were made in determining the standard. We describe the structure of the GrAF headers in detail and provide multiple examples ...
متن کاملA standardized general framework for encoding and exchange of corpus annotations: The Linguistic Annotation Framework, LAF
The Linguistic Annotation Framework, LAF, proposes a generic data model for exchange of linguistic annotations and has recently become an ISO standard (ISO 24612:2012). This paper describes some aspects of LAF, its XML-serialization GrAF and some use-cases related to the framework. While GrAF has already been used as exchange format for corpora with several annotation layers, such as MASC and O...
متن کاملImporting MASC into the ANNIS linguistic database: A case study of mapping GrAF
This paper describes the importation of Manually Annotated Sub-Corpus (MASC) data and annotations into the linguistic database ANNIS, which allows users to visualize and query linguistically-annotated corpora. We outline the process of mapping MASC’s GrAF representation to ANNIS’s internal format relANNIS and demonstrate how the system provides access to multiple annotation layers in the corpus...
متن کامل